Predicting subcellular localization of proteins based on their N-terminal amino acid sequence.
Abstract
A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and "other" localizations with a success rate of 85% (plant) or 90% (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40% to 50% (chloroplastic and mitochondrial presequences) to above 70% (secretory signal peptides). TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/.
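As a rough illustration of what "using N-terminal sequence information only" can mean in practice, the sketch below one-hot encodes a fixed-length N-terminal window and maps four class scores to the four localizations named in the abstract. It is not TargetP's actual encoding or network: the window length of 50, the function names `encode_n_terminal` and `pick_location`, and the example scores are all assumptions made for illustration.

```python
# Illustrative sketch only; not the TargetP implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
LOCATIONS = ("mitochondrion", "chloroplast", "secretory pathway", "other")


def encode_n_terminal(sequence, window=50):
    """One-hot encode the first `window` residues (window size is an assumption).

    Sequences shorter than `window` are padded with all-zero vectors.
    """
    encoded = []
    for residue in sequence.upper()[:window]:
        vector = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:  # non-standard residues (e.g. X) stay all-zero
            vector[index] = 1
        encoded.append(vector)
    while len(encoded) < window:
        encoded.append([0] * len(AMINO_ACIDS))
    return encoded


def pick_location(scores):
    """Return the localization with the highest score (hypothetical classifier output)."""
    if len(scores) != len(LOCATIONS):
        raise ValueError("expected one score per localization")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LOCATIONS[best]


if __name__ == "__main__":
    features = encode_n_terminal("MKTAYIAKQR", window=50)
    print(len(features), len(features[0]))
    print(pick_location([0.1, 0.7, 0.1, 0.1]))
```

A trained classifier would sit between the two functions, turning the encoded window into the four scores; that part is deliberately left out here because the abstract does not describe the network architecture.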
Similar resources
MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition
MOTIVATION Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting proces...
Using N-terminal targeting sequences, amino acid composition, and sequence motifs for predicting protein subcellular localizations
Functional annotation of unknown proteins is a major goal in proteomics. A key step in this annotation process is the definition of a protein’s subcellular localization. As a consequence, numerous prediction techniques for localization have been developed over the years. These methods typically focus on a single underlying biological aspect or predict a subset of all possible subcellular locali...
Predicting the Subcellular Localization of Human Proteins Using Machine Learning and Exploratory Data Analysis
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 protein sequences representing human proteins was extracted ...
Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms.
When predicting the subcellular localization of proteins from their amino acid sequences, there are basically three approaches: signal-based, global property-based, and homology-based. Each of these has its advantages and drawbacks, and it is important when comparing methods to know which approach was used. Various statistical and machine learning algorithms are used with all three approaches, ...
SVM-based method for subcellular localization of human proteins using amino acid compositions, their order and similarity search
Here we report a systematic approach for predicting subcellular localization (cytoplasm, mitochondrial, nuclear and plasma membrane) of human proteins. Firstly, SVM-based modules for predicting subcellular localization using traditional amino a...
Journal title:
- Journal of Molecular Biology
Volume 300, Issue 4
Pages -
Publication date: 2000